Evaluating performance in systems with heavy-tailed input: a quantile-based approach
Author
Abstract
One of the key invariants in computer and communication systems is that important characteristics follow long- or heavy-tailed distributions. This means that the tail of these distributions declines according to a power law. Hence, the probability of extremely large values is non-negligible. For example, such distributions have been found to describe the size of web objects or the processing latencies in computer and communication systems. As a consequence, there is a need to employ such distributions when evaluating such systems with synthetic workloads. However, sampling from such distributions to generate workloads implies that the system under evaluation remains in a transient state over all periods of time that are feasible for performance evaluations. Consequently, frequently used statistics for performance evaluation, such as the average of the system output, do not converge.

In this thesis we move away from evaluation using statistics such as the average, which describe the expected behavior of the system in all cases, and take the step towards evaluation using statistics such as quantiles, which describe the behavior in a given percentage of cases. Such quantiles of the system output do not depend on the extreme tail of the output distribution. We therefore address the problem of whether employing quantiles can enable performance evaluations within periods of time that are feasible in practice.

Quantiles have a natural interpretation for statistically characterizing system performance. For example, if a system offers a web service, the 99th percentile of the download latencies can statistically characterize the system performance from a user's view, since 99% of downloads terminate within times smaller than this quantile. If converged, such quantiles can be used to derive statistical guarantees for the system performance. Similar statements hold for system components such as servers and networks.
Applying probability theory, we show that statistics of quantiles converge considerably faster than other frequently used performance evaluation statistics if the underlying distribution is long- or heavy-tailed. Based on this theory, we give a method that makes it possible to evaluate system performance under long- or heavy-tailed input within periods of time that are practically feasible. We validate the proposed method by applying it to a simulation-based evaluation of the network performance of systems that offer web services. We show that the proposed method has further applications to other problems that require performance evaluation with synthetic workloads which are generated by sampling …
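To make the contrast between the average and a quantile concrete, the following is a minimal sketch, not the method of the thesis. It assumes a Pareto distribution with minimum value 1 as a stand-in for a heavy-tailed workload, and a tail index of 1.2 chosen purely for illustration; the function names (pareto_quantile, empirical_quantile, running_estimates) are hypothetical. It tracks the sample mean and the empirical 99th percentile as the sample grows, so the two can be compared against their theoretical values (a mean of 6 and a 99th percentile of about 46.4 for this assumed tail index).

```python
import math
import random


def pareto_quantile(p, alpha, x_min=1.0):
    """Closed-form p-quantile of a Pareto(alpha, x_min) distribution."""
    return x_min * (1.0 - p) ** (-1.0 / alpha)


def empirical_quantile(samples, p):
    """Nearest-rank empirical p-quantile (0 < p <= 1)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


def running_estimates(alpha, checkpoints, p=0.99, seed=0):
    """Return (n, sample mean, empirical p-quantile) at each checkpoint."""
    rng = random.Random(seed)
    samples, total, rows = [], 0.0, []
    for n in sorted(checkpoints):
        # Extend the same sample so the estimates are nested, as in one long run.
        while len(samples) < n:
            x = rng.paretovariate(alpha)  # Pareto draw with x_min = 1
            samples.append(x)
            total += x
        rows.append((n, total / n, empirical_quantile(samples, p)))
    return rows


if __name__ == "__main__":
    alpha = 1.2  # assumed tail index, for illustration only
    print("theoretical p99:", round(pareto_quantile(0.99, alpha), 2))
    for n, mean, q in running_estimates(alpha, [1_000, 10_000, 100_000]):
        print(f"n={n:>7}  mean={mean:8.2f}  p99={q:8.2f}")
```

Rerunning with different seeds, or with a smaller tail index, gives a rough feel for how strongly a few extreme samples can move the running mean, whereas the percentile depends only on the rank of the samples and not on how large the largest ones are.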
Similar resources
Tail Exponent Estimation via Broadband Log Density-Quantile Regression
Heavy tail probability distributions are important in many scientific disciplines such as hydrology, geology, and physics and therefore feature heavily in statistical practice. Rather than specifying a family of heavy-tailed distributions for a given application, it is more common to use a nonparametric approach, where the distributions are classified according to the tail behavior. Through the...
Quantile Optimization for Heavy-tailed Distributions Using Asymmetric Signum Functions
In this paper, we present a provably convergent algorithm for computing the quantile of a random variable that does not require storing all of the sample realizations. We then present an algorithm for optimizing the quantile of a random function which may be characterized by a heavy-tailed distribution where the expectation is not defined. The algorithm is illustrated in the context of electric...
Estimating extreme quantiles under random truncation
The goal of this paper is to provide estimators of the tail index and extreme quantiles of a heavy-tailed random variable when it is right-truncated. The weak consistency and asymptotic normality of the estimators are established. The finite sample performance of our estimators is illustrated in a simulation study and we showcase our estimators on a real set of failure data. keywords: Asymptotic...
Robust Scatter Matrix Estimation for High Dimensional Distributions with Heavy Tails
This paper studies large scatter matrix estimation for heavy-tailed distributions. The contributions of this paper are twofold. First, we propose and advocate using a new distribution family, the pair-elliptical, for modeling high dimensional data. The pair-elliptical is more flexible than the elliptical, and its goodness of fit is easier to check. Secondly, built on the pair-elliptical...
Robust Estimation of Transition Matrices in High Dimensional Heavy-tailed Vector Autoregressive Processes
Gaussian vector autoregressive (VAR) processes have been extensively studied in the literature. However, Gaussian assumptions are stringent for heavy-tailed time series that frequently arise in finance and economics. In this paper, we develop a unified framework for modeling and estimating heavy-tailed VAR processes. In particular, we generalize the Gaussian VAR model by an elliptical VAR mode...